930 research outputs found

    Escaping free-energy minima

    We introduce a novel and powerful method for exploring the properties of the multidimensional free energy surfaces of complex many-body systems by means of a coarse-grained non-Markovian dynamics in the space defined by a few collective coordinates. A characteristic feature of this dynamics is the presence of a history-dependent potential term that, in time, fills the minima in the free energy surface, allowing the efficient exploration and accurate determination of the free energy surface as a function of the collective coordinates. We demonstrate the usefulness of this approach in the case of the dissociation of a NaCl molecule in water and in the study of the conformational changes of a dialanine in solution. Comment: 3 figure
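
The history-dependent potential described in this abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: it assumes a one-dimensional double-well potential, overdamped Langevin dynamics with unit friction, and Gaussian bias deposits of fixed height and width. All names and parameter values (`BARRIER`, `run_metadynamics`, `height`, `width`, `stride`) are hypothetical choices made for illustration.

```python
import math
import random

BARRIER = 5.0  # assumed barrier height of the toy double well, in units of kT


def potential_force(x):
    """Force -dU/dx for the toy potential U(x) = BARRIER * (x**2 - 1)**2."""
    return -4.0 * BARRIER * x * (x * x - 1.0)


def bias_energy(x, centers, height, width):
    """History-dependent bias: a sum of Gaussians centered on visited positions."""
    return sum(height * math.exp(-((x - c) ** 2) / (2.0 * width ** 2))
               for c in centers)


def bias_force(x, centers, height, width):
    """Force -dV_bias/dx; it pushes the walker away from already visited regions."""
    return sum(height * (x - c) / width ** 2
               * math.exp(-((x - c) ** 2) / (2.0 * width ** 2))
               for c in centers)


def run_metadynamics(n_steps=20000, dt=1e-3, kT=1.0,
                     height=0.3, width=0.2, stride=100, seed=0):
    """Overdamped Langevin walker with a Gaussian deposited every `stride` steps."""
    rng = random.Random(seed)
    x = -1.0  # start at the bottom of the left well
    centers, trajectory = [], []
    noise = math.sqrt(2.0 * kT * dt)
    for step in range(n_steps):
        if step % stride == 0:
            centers.append(x)  # deposit a new Gaussian at the current position
        total_force = potential_force(x) + bias_force(x, centers, height, width)
        x += total_force * dt + noise * rng.gauss(0.0, 1.0)
        trajectory.append(x)
    return trajectory, centers


if __name__ == "__main__":
    trajectory, centers = run_metadynamics()
    print("deposited Gaussians:", len(centers))
    print("range explored:", min(trajectory), max(trajectory))
```

In this toy setting the accumulated bias gradually fills the starting well, so the walker leaves it sooner than thermal fluctuations alone would typically allow. The negative of the accumulated bias then gives a rough estimate of the free energy profile, up to an additive constant.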

    Clustering by fast search-and-find of density peaks

    Cluster analysis is aimed at classifying elements into categories on the basis of their similarity. Its applications range from astronomy to bioinformatics, bibliometrics, and pattern recognition. We propose an approach based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities. This idea forms the basis of a clustering procedure in which the number of clusters arises intuitively, outliers are automatically spotted and excluded from the analysis, and clusters are recognized regardless of their shape and of the dimensionality of the space in which they are embedded. We demonstrate the power of the algorithm on several test cases.
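
The two quantities named in this abstract, the local density of a point and its distance from the nearest point of higher density, can be computed directly. The sketch below is a minimal illustration, not the published algorithm in full: it assumes a hard cutoff distance `d_c` for the density, breaks density ties by index, and ranks candidate centers by the product of the two quantities. The function names and the product ranking are assumptions made for this example.

```python
import math


def density_and_delta(points, d_c):
    """Return (rho, delta) for each point.

    rho[i]   = number of other points closer than the cutoff d_c
    delta[i] = distance to the nearest point of higher density
               (for the densest point: its largest distance to any point)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Sort by decreasing density; the stable sort breaks ties by index.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for k in range(1, n):
        i = order[k]
        delta[i] = min(dist[i][j] for j in order[:k])
    return rho, delta


def pick_centers(points, d_c, n_centers):
    """Pick the n_centers points with the largest rho * delta."""
    rho, delta = density_and_delta(points, d_c)
    ranked = sorted(range(len(points)), key=lambda i: -rho[i] * delta[i])
    return ranked[:n_centers]


if __name__ == "__main__":
    blob_a = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
    blob_b = [(x + 5.0, y + 5.0) for x, y in blob_a]
    outlier = [(10.0, 0.0)]
    print(pick_centers(blob_a + blob_b + outlier, d_c=0.15, n_centers=2))
```

Points that combine high density with a large distance to any denser point stand out as candidate centers. An isolated point has a large distance but zero density under this cutoff, so it is not selected.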

    Predicting crystal structures: the Parrinello-Rahman method revisited

    By suitably adapting a recent approach [A. Laio and M. Parrinello, PNAS, 99, 12562 (2002)] we develop a powerful molecular dynamics method for the study of pressure-induced structural transformations. We use the edges of the simulation cell as collective variables. In the space of these variables we define a metadynamics that drives the system away from the local minimum towards a new crystal structure. In contrast to the Parrinello-Rahman method, our approach shows no hysteresis and crystal structure transformations can occur at the equilibrium pressure. We illustrate the power of the method by studying the pressure-induced diamond to simple hexagonal phase transition in a model of silicon. Comment: 5 pages, 2 Postscript figures, submitte

    Assessing the capability of in silico mutation protocols for predicting the finite temperature conformation of amino acids

    Mutation protocols are a key tool in computational biophysics for modelling unknown side chain conformations. In particular, these protocols are used to generate the starting structures for molecular dynamics simulations. The accuracy of the initial side chain and backbone placement is crucial to obtain a stable and quickly converging simulation. In this work, we assessed the performance of several mutation protocols in predicting the most probable conformer observed in finite temperature molecular dynamics simulations for a set of protein-peptide crystals differing only by single-point mutations in the peptide sequence. Our results show that several programs that predict the crystal conformations well fail to predict the most probable finite temperature configuration. Methods relying on backbone-dependent rotamer libraries generally perform better, but even the best protocol fails to predict approximately 30% of the mutations.

    Candidate Binding Sites for Allosteric Inhibition of the SARS-CoV-2 Main Protease from the Analysis of Large-Scale Molecular Dynamics Simulations

    We analyzed a 100 μs MD trajectory of the SARS-CoV-2 main protease by a non-parametric data analysis approach which allows characterizing a free energy landscape as a simultaneous function of hundreds of variables. We identified several conformations that, when visited by the dynamics, are stable for several hundred nanoseconds. We explicitly characterize and describe these metastable states. In some of these configurations, the catalytic dyad is less accessible. Stabilizing them by a suitable binder could lead to an inhibition of the enzymatic activity. In our analysis we keep track of relevant contacts between residues which are selectively broken or formed in the states. Some of these contacts are formed by residues which are far from the catalytic dyad and are accessible to the solvent. Based on this analysis we propose some relevant contact patterns and three possible binding sites which could be targeted to achieve allosteric inhibition

    The intrinsic dimension of protein sequence evolution

    It is well known that, in order to preserve its structure and function, a protein cannot change its sequence at random, but only by mutations occurring preferentially at specific locations. We here investigate quantitatively the amount of variability that is allowed in protein sequence evolution, by computing the intrinsic dimension (ID) of the sequences belonging to a selection of protein families. The ID is a measure of the number of independent directions that evolution can take starting from a given sequence. We find that the ID is practically constant for sequences belonging to the same family, and moreover it is very similar in different families, with values ranging between 6 and 12. These values are significantly smaller than the raw number of amino acids, confirming the importance of correlations between mutations in different sites. However, we demonstrate that correlations are not sufficient to explain the small value of the ID we observe in protein families. Indeed, we show that the ID of a set of protein sequences generated by maximum entropy models, an approach in which correlations are accounted for, is typically significantly larger than the value observed in natural protein families. We further prove that a critical factor to reproduce the natural ID is to take into consideration the phylogeny of sequences

    Intrinsic dimension of data representations in deep neural networks

    Deep neural networks progressively transform their inputs across multiple processing layers. What are the geometrical properties of the representations learned by these networks? Here we study the intrinsic dimensionality (ID) of data-representations, i.e. the minimal number of parameters needed to describe a representation. We find that, in a trained network, the ID is orders of magnitude smaller than the number of units in each layer. Across layers, the ID first increases and then progressively decreases in the final layers. Remarkably, the ID of the last hidden layer predicts classification accuracy on the test set. These results can neither be found by linear dimensionality estimates (e.g., with principal component analysis), nor in representations that had been artificially linearized. They are neither found in untrained networks, nor in networks that are trained on randomized labels. This suggests that neural networks that can generalize are those that transform the data into low-dimensional, but not necessarily flat manifolds
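
To make the notion of intrinsic dimension concrete, the sketch below estimates it from the ratio of each point's second and first nearest-neighbor distances. This is a maximum-likelihood form of a two-nearest-neighbor estimator. The abstract does not say which estimator was used, so the choice is an assumption, and the function name `two_nn_id` and the test surface are hypothetical.

```python
import math
import random


def two_nn_id(points):
    """Estimate the intrinsic dimension from nearest-neighbor distance ratios.

    For locally uniform data on a d-dimensional manifold, the ratio
    mu = r2 / r1 of the second to the first nearest-neighbor distance
    follows a Pareto law with exponent d, whose maximum-likelihood
    estimate is d = N / sum(log(mu_i)).
    """
    logs = []
    for i, p in enumerate(points):
        r1 = r2 = math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < r1:
                r1, r2 = d, r1
            elif d < r2:
                r2 = d
        if r1 > 0.0:  # skip exact duplicates, whose ratio is undefined
            logs.append(math.log(r2 / r1))
    return len(logs) / sum(logs)


if __name__ == "__main__":
    rng = random.Random(0)
    # A gently curved two-dimensional surface embedded in three dimensions.
    surface = []
    for _ in range(500):
        x, y = rng.random(), rng.random()
        surface.append((x, y, math.sin(x)))
    print("estimated ID:", two_nn_id(surface))
```

The points live in a three-dimensional space but lie on a two-dimensional surface, so an estimate near 2 is expected rather than 3. Because the estimator uses only local distances, it does not require the surface to be flat.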

    Metadynamics Simulations Reveal a Na+ Independent Exiting Path of Galactose for the Inward-Facing Conformation of vSGLT

    The Sodium-Galactose Transporter (SGLT) is a secondary active symporter which accumulates sugars into cells by using the electrochemical gradient of Na+ across the membrane. Previous computational studies provided insights into the release process of the two ligands (galactose and sodium ion) into the cytoplasm from the inward-facing conformation of Vibrio parahaemolyticus sodium/galactose transporter (vSGLT). Several aspects of the transport mechanism of this symporter remain to be clarified: (i) a detailed kinetic and thermodynamic characterization of the exit path of the two ligands is still lacking; (ii) contradictory conclusions have been drawn concerning the gating role of Y263; (iii) the role of Na+ in modulating the release path of galactose is not clear. In this work, we use bias-exchange metadynamics simulations to characterize the free energy profile of the galactose and Na+ release processes toward the intracellular side. Surprisingly, we find that the exit of Na+ and galactose is non-concerted, as the cooperativity between the two ligands is associated with a transition that is not rate limiting. The dissociation barriers are of the order of 11-12 kcal/mol for both the ion and the substrate, in line with kinetic information concerning this type of transporter. On the basis of these results we propose a branched six-state alternating access mechanism, which may also be shared by other members of the LeuT-fold transporters.

    Data segmentation based on the local intrinsic dimension

    One of the founding paradigms of machine learning is that a small number of variables is often sufficient to describe high-dimensional data. The minimum number of variables required is called the intrinsic dimension (ID) of the data. Contrary to common intuition, there are cases where the ID varies within the same data set. This fact has been highlighted in technical discussions, but seldom exploited to analyze large data sets and obtain insight into their structure. Here we develop a robust approach to discriminate regions with different local IDs and segment the points accordingly. Our approach is computationally efficient and can be proficiently used even on large data sets. We find that many real-world data sets contain regions with widely heterogeneous dimensions. These regions host points differing in core properties: folded versus unfolded configurations in a protein molecular dynamics trajectory, active versus non-active regions in brain imaging data, and firms with different financial risk in company balance sheets. A simple topological feature, the local ID, is thus sufficient to achieve an unsupervised segmentation of high-dimensional data, complementary to the one given by clustering algorithms